An Inductive Learning System for XML Documents

Author

  • Xiaobing Wu
Abstract

This paper presents a complete inductive learning system that aims to produce comprehensible theories for XML document classification. The knowledge representation method is based on a higher-order logic formalism that is particularly suitable for structured-data learning systems. A systematic way of generating predicates is also given. The learning algorithm is a modified version of a standard decision-tree learning algorithm, driven by the precision/recall breakeven point. Experimental results on an XML version of the Reuters dataset show that the system produces comprehensible theories with high precision/recall breakeven point values.
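The abstract does not say how the breakeven point is computed. As a minimal sketch, assuming the usual text-categorization convention that it is the value at which precision equals recall, it can be read off a ranked list: if a classifier ranks documents by score and the top R are predicted positive, where R is the number of truly positive documents, then precision and recall are both (true positives in the top R) / R. The function name, the score-based ranking, and the sample data below are illustrative assumptions, not part of the paper's system.

```python
from typing import Sequence


def precision_recall_breakeven(scores: Sequence[float],
                               labels: Sequence[bool]) -> float:
    """Return the precision/recall breakeven value for a ranked list.

    Assumption (not from the paper): documents are ranked by a
    classifier score, and the cutoff is placed after the top R
    documents, where R is the number of true positives in the data.
    At that cutoff precision and recall are equal by construction.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    total_positives = sum(1 for y in labels if y)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Count true positives among the top R ranked documents.
    true_positives = sum(1 for _, y in ranked[:total_positives] if y)

    # precision = TP / R and recall = TP / R, so both equal this ratio.
    return true_positives / total_positives


if __name__ == "__main__":
    # Hypothetical scores and labels for five documents.
    example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    example_labels = [True, False, True, True, False]
    print(precision_recall_breakeven(example_scores, example_labels))
```

Placing the cutoff at R avoids searching over thresholds, but tied scores make the ranking, and therefore the result, depend on the sort order.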

Similar resources

Automating XML markup of text documents

We present a novel system for automatically marking up text documents into XML and discuss the benefits of XML markup for intelligent information retrieval. The system uses the Self-Organizing Map (SOM) algorithm to arrange XML marked-up documents on a two-dimensional map so that similar documents appear closer to each other. It then employs an inductive learning algorithm C5 to automatically ex...

Automating XML Markup using Machine Learning Techniques

In this paper we present a novel system for automatically marking up text documents into XML. The system uses the techniques of the Self-Organising Map (SOM) algorithm in conjunction with an inductive learning algorithm, C5.0. The SOM algorithm clusters the XML marked-up documents on a two-dimensional map such that documents having similar content are placed close to each other. The C5.0 algori...

Auto-tagging of Text Documents into XML

In this paper we present a novel system which automatically converts text documents into XML by extracting information from previously tagged XML documents. The system uses the Self-Organizing Map (SOM) learning algorithm to arrange tagged documents on a two-dimensional map such that nearby locations contain similar documents. It then employs the inductive learning algorithm C5.0 to automatical...

Automating XML mark-up using a two stage machine learning technique

We introduce a novel two-stage automatic XML mark-up system, which combines the WEBSOM approach to document categorisation in conjunction with the C5 inductive learning algorithm. The WEBSOM method clusters the XML marked-up documents such that semantically similar documents lie close together on a Self-Organising Map (SOM). The C5 algorithm automatically learns and applies mark-up rules derive...

AutoMarkup: A Tool for Automatically Marking up Text Documents

In this paper we present a novel system that can automatically mark up text documents into XML. The system uses the Self-Organizing Map (SOM) algorithm to organize marked documents on a map so that similar documents are placed on nearby locations. Then by using the inductive learning algorithm C5, it automatically generates and applies the markup rules from the nearest SOM neighbours of an unma...

Publication date: 2007